Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis
Abstract
Background
The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets.

Results
Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project.

Conclusions
Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics.
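To make the computation being accelerated concrete, the following is a minimal sketch of a single-variant logistic regression fitted by Newton-Raphson. It is an illustration only, not the PLINK 1.07 or PLINK2 code: the function name `fit_logistic`, the intercept-plus-genotype model without covariates, and the convergence settings are all assumptions made for this example.

```python
import math


def fit_logistic(genotypes, phenotypes, max_iter=25, tol=1e-8):
    """Fit logit P(case) = b0 + b1 * genotype by Newton-Raphson.

    genotypes:  per-subject allele counts (e.g. 0, 1, 2)
    phenotypes: per-subject case/control status (1 or 0)
    Returns (b0, b1, se_b1). Hypothetical helper, for illustration only.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        # Score vector (g0, g1) and 2x2 information matrix
        # [[h00, h01], [h01, h11]] at the current estimates.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(genotypes, phenotypes):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            raise ValueError("singular information matrix (monomorphic variant?)")
        # Newton step: solve the 2x2 system in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Standard error of b1 from the inverse information matrix of the
    # last iteration (evaluated just before the final, tiny update).
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1


# Example: additive genotype coding for eight subjects.
b0, b1, se = fit_logistic([0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 0, 0, 1, 0, 1, 1])
wald_z = b1 / se  # Wald statistic for the variant's effect
```

In a genome-wide scan a fit of this kind is repeated once per variant, and each fit is independent of the others, which is why the per-fit cost and parallelization across variants both matter for end-to-end run time.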
Similar resources
Genome-wide Association Study to Identify Genes and Biological Pathways Associated with Type Traits in Cattle using Pathway Analysis
Extended Abstract Introduction and Objective: Type traits describing the skeletal characteristics of an animal are moderately to strongly genetically correlated with other economically important traits in cattle, including fertility, longevity and carcass traits. The present study aimed to conduct a genome-wide association study (GWAS) based on gene-set enrichment analysis for identifying the ...
Innovation Contests - Where are we?
Innovation contests in their basic structure have a long-standing tradition and can be attributed to continuously gain in importance as a corporate practice. A deep understanding of this online instrument, however, is still lacking. Contrary to other methods used to realize open innovation, research in the field of online innovation contests displays a growing, but only rudimentarily intertwine...
How to Design Prizes - Incentives in Innovation Contests
This research paper focuses on the use of prizes in innovation contests as means to encourage participation and efforts in open innovation activities. Building on a systematic review of 69 innovation contests, we investigate differences in the design of prizes between innovation contests focusing on ideation and those with a development orientation. Results show a very heterogeneous picture, in...
Beyond Innovation Contests: a Framework of Barriers to Open Innovation of Digital Services
Recently, the interest in the innovation of digital services based on open public information (i.e. open data) has increased dramatically. Innovation contests, such as idea competitions and digital innovation contests, have become popular instruments to accelerate the development of new service ideas and prototypes. However, only a few of the service prototypes developed at innovation contests ...
Innovation Contests: Systematization of the Field and Future Research
The ability to generate innovative products and services is a critical success factor for organizations. The trend of open innovation has brought about many-faceted, IT-based tools (e.g., lead user method or online tool kits), among these, the innovation contest seems particularly promising and continuously gains in importance as a corporate practice. However, a deep understanding of this onlin...
Volume 6
Publication date: 2017